About me
Hi! I’m Ying. I am an Assistant Professor in the Department of Statistics and Data Science at the Wharton School, University of Pennsylvania.
I obtained my PhD in Statistics from Stanford University in 2024, advised by Professors Emmanuel Candès and Dominik Rothenhäusler. Prior to that, I studied Mathematics at Tsinghua University. Before joining Wharton, I spent one year as a Wojcicki-Troper Postdoctoral Fellow at the Harvard Data Science Initiative, where I had the good fortune to work with Professor José Zubizarreta and Professor Marinka Zitnik.
I currently help organize the Online Causal Inference Seminar.
Research interests
I work on statistical problems related to two main themes:
- Uncertainty quantification
I study distribution-free inference for quantifying and controlling the uncertainty of black-box AI models. My recent interest is in statistical inference guarantees for settings where scientific discoveries are driven or generated by AI predictions, motivated by applications in predictive screening for drug discovery, generative medical AI, and automated scientific discovery with AI agents.
This often necessitates inference across multiple, decision-coupled samples, and leads to new conformal prediction methods with selective inference and causal inference capabilities.
News
- Sep 2025: Our Pessimistic Policy Learning paper has been selected by the Annals of Statistics for presentation in the journal-to-conference track at NeurIPS 2025!
- Sep 2025: Our paper on the predictive role of covariate shift in generalizability has been accepted to PNAS! Analyzing two large-scale multi-site replication projects, it suggests that covariate shift plays a predictive, rather than explanatory, role: it informs the strength of unknown conditional shift, even though it does not explain away all the distribution shift between sites. See my blog post here!
- May 2025: I’m organizing an invited session on generalizability, transportability, and distribution shift at ACIC 2025!
- Apr 2025: I gave a talk on our POPPER agent framework at the International Seminar on Selective Inference! [slides] [recording]
- Feb 2025: Imagine LLM agents for scientific discovery: agents that autonomously gather knowledge through creative reasoning and flexible tool use. How can we ensure the soundness of what they acquire? We propose POPPER, a framework where LLM agents design sequential experiments, collect data, and accumulate statistical evidence to validate a free-form hypothesis with error control!
- Sept 2024: Outputs from black-box foundation models must align with human values before use. For example, can we ensure only human-quality AI-generated medical reports are deferred to doctors? Our paper Conformal Alignment is accepted to NeurIPS 2024!
- Sept 2024: My paper on optimal variance reduction in online experiments (a 2021 internship project at LinkedIn) received the 2024 Jack Youden Prize for the best expository paper in Technometrics! Thank you, ASQ/ASA!
- March 2024: How can we quantify uncertainty for an “interesting” unit picked by a complicated, data-driven process? Check out JOMI, our framework for conformal prediction with selection-conditional coverage!
- Sept 2023: I’ll be giving a seminar at Genentech on leveraging Conformal Selection [1, 2] for reliable AI-assisted drug discovery.
- Sept 2023: Scientists often point to distribution shifts when effects from two studies differ, e.g., in replicability failures. Do these shifts really contribute? See our preprint for a formal diagnosis framework. Play with our live app, or explore our data repository! I gave an invited talk about it at the Causality in Practice Conference.
Beyond academics, I love traveling and photography in my free time. See my photography gallery!
Education
Recent posts